genetic programming
Node Preservation and its Effect on Crossover in Cartesian Genetic Programming
Kocherovsky, Mark, Bakurov, Illya, Banzhaf, Wolfgang
While crossover is a critical and often indispensable component in other forms of Genetic Programming, such as Linear- and Tree-based, it has consistently been claimed that it deteriorates search performance in CGP. As a result, a mutation-alone $(1+λ)$ evolutionary strategy has become the canonical approach for CGP. Although several operators have been developed that demonstrate an increased performance over the canonical method, a general solution to the problem is still lacking. In this paper, we compare basic crossover methods, namely one-point and uniform, to variants in which nodes are ``preserved,'' including the subgraph crossover developed by Roman Kalkreuth, the difference being that when ``node preservation'' is active, crossover is not allowed to break apart instructions. We also compare a node mutation operator to the traditional point mutation; the former simply replaces an entire node with a new one. We find that node preservation in both mutation and crossover improves search using symbolic regression benchmark problems, moving the field towards a general solution to CGP crossover.
- North America > United States > California > San Francisco County > San Francisco (0.15)
- North America > United States > New York > New York County > New York City (0.05)
- Europe > Switzerland (0.04)
- (4 more...)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.05)
- North America > United States > California > Alameda County > Livermore (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States (0.04)
- Europe > Italy > Marche > Ancona Province > Ancona (0.04)
Identification of Empirical Constitutive Models for Age-Hardenable Aluminium Alloy and High-Chromium Martensitic Steel Using Symbolic Regression
Kabliman, Evgeniya, Kronberger, Gabriel
Process-structure-property relationships are fundamental in materials science and engineering and are key to the development of new and improved materials. Symbolic regression serves as a powerful tool for uncovering mathematical models that describe these relationships. It can automatically generate equations to predict material behaviour under specific manufacturing conditions and optimize performance characteristics such as strength and elasticity. The present work illustrates how symbolic regression can derive constitutive models that describe the behaviour of various metallic alloys during plastic deformation. Constitutive modelling is a mathematical framework for understanding the relationship between stress and strain in materials under different loading conditions. In this study, two materials (age-hardenable aluminium alloy and high-chromium martensitic steel) and two different testing methods (compression and tension) are considered to obtain the required stress-strain data. The results highlight the benefits of using symbolic regression while also discussing potential challenges.
Performance Evaluation of Bitstring Representations in a Linear Genetic Programming Framework
Meli, Clyde, Nezval, Vitezslav, Oplatkova, Zuzana Kominkova, Buttigieg, Victor, Staines, Anthony Spiteri
Different bitstring representations can yield varying computational performance. This work compares three bitstring implementations in C++: std::bitset, boost::dynamic_bitset, and a custom direct implementation. Their performance is benchmarked in the context of concatenation within a Linear Genetic Programming system. Benchmarks were conducted on three platforms (macOS, Linux, and Windows MSYS2) to assess platform specific performance variations. The results show that the custom direct implementation delivers the fastest performance on Linux and Windows, while std::bitset performs best on macOS. Although consistently slower, boost::dynamic_bitset remains a viable and flexible option. These findings highlight the influence of compiler optimisations and system architecture on performance, providing practical guidance for selecting the optimal method based on platform and application requirements.
- Europe > Middle East > Malta (0.06)
- Europe > Czechia (0.05)
- North America > United States > New York > New York County > New York City (0.04)
Equality Graph Assisted Symbolic Regression
de Franca, Fabricio Olivetti, Kronberger, Gabriel
In Symbolic Regression (SR), Genetic Programming (GP) is a popular search algorithm that delivers state-of-the-art results in term of accuracy. Its success relies on the concept of neutrality, which induces large plateaus that the search can safely navigate to more promising regions. Navigating these plateaus, while necessary, requires the computation of redundant expressions, up to 60% of the total number of evaluation, as noted in a recent study. The equality graph (e-graph) structure can compactly store and group equivalent expressions enabling us to verify if a given expression and their variations were already visited by the search, thus enabling us to avoid unnecessary computation. We propose a new search algorithm for symbolic regression called SymRegg that revolves around the e-graph structure following simple steps: perturb solutions sampled from a selection of expressions stored in the e-graph, if it generates an unvisited expression, insert it into the e-graph and generates its equivalent forms. We show that SymRegg is capable of improving the efficiency of the search, maintaining consistently accurate results across different datasets while requiring a choice of a minimalist set of hyperparameters.
- Europe > Austria > Upper Austria (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- South America > Brazil (0.04)
- (2 more...)
EvoSpeak: Large Language Models for Interpretable Genetic Programming-Evolved Heuristics
Xu, Meng, Liu, Jiao, Ong, Yew Soon
Abstract--Genetic programming (GP) has demonstrated strong effectiveness in evolving tree-structured heuristics for complex optimization problems. Y et, in dynamic and large-scale scenarios, the most effective heuristics are often highly complex, hindering interpretability, slowing convergence, and limiting transferability across tasks. T o address these challenges, we present EvoSpeak, a novel framework that integrates GP with large language models (LLMs) to enhance the efficiency, transparency, and adaptability of heuristic evolution. EvoSpeak learns from high-quality GP heuristics, extracts knowledge, and leverages this knowledge to (i) generate warm-start populations that accelerate convergence, (ii) translate opaque GP trees into concise natural-language explanations that foster interpretability and trust, and (iii) enable knowledge transfer and preference-aware heuristic generation across related tasks. We verify the effectiveness of EvoSpeak through extensive experiments on dynamic flexible job shop scheduling (DFJSS), under both single-and multi-objective formulations. The results demonstrate that EvoSpeak produces more effective heuristics, improves evolutionary efficiency, and delivers human-readable reports that enhance usability. EURISTICS are indispensable tools for solving complex decision-making and optimization problems, with applications spanning scheduling [1], routing [2], and resource allocation [3]. They are designed to provide adaptive, domain-specific solutions that balance solution quality and computational efficiency, enabling practitioners to make near-optimal decisions in real time. Among the diverse methodologies for heuristic design, Genetic Programming (GP) [4] has emerged as a particularly powerful paradigm, capable of evolving interpretable symbolic rules that adapt to different problem structures [5]. GP-generated heuristics often rival, and sometimes surpass, learning-based methods such as neural combinatorial optimization [6], especially in terms of transparency and adaptability. Meng Xu is with the Singapore Institute of Manufacturing Technology, Agency for Science, Technology and Research, Singapore (e-mail: xu_meng@simtech.a-star.edu.sg). Jiao Liu is with the College of Computing & Data Science, Nanyang Technological University, Singapore (e-mail: jiao.liu@ntu.edu.sg). Y ew Soon Ong is with the College of Computing and Data Science, Nanyang Technological University, and the Centre for Frontier AI Research, Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore (e-mail: asysong@ntu.edu.sg). Despite these advantages, the practical deployment of GPevolved heuristics faces two persistent challenges: complexity and transferability.
- Asia > Singapore (0.84)
- North America > United States > Montana (0.04)
- North America > United States > California (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Domain-Informed Genetic Superposition Programming: A Case Study on SFRC Beams
Khorshidi, Mohammad Sadegh, Yazdanjue, Navid, Gharoun, Hassan, Nikoo, Mohammad Reza, Chen, Fang, Gandomi, Amir H.
This study presents domain-informed genetic superposition programming (DIGSP), a symbolic regression framework tailored for engineering systems governed by separable physical mechanisms. DIGSP partitions the input space into domain-specific feature subsets and evolves independent genetic programming (GP) populations to model material-specific effects. Early evolution occurs in isolation, while ensemble fitness promotes inter-population cooperation. To enable symbolic superposition, an adaptive hierarchical symbolic abstraction mechanism (AHSAM) is triggered after stagnation across all populations. AHSAM performs analysis of variance- (ANOVA) based filtering to identify statistically significant individuals, compresses them into symbolic constructs, and injects them into all populations through a validation-guided pruning cycle. The DIGSP is benchmarked against a baseline multi-gene genetic programming (BGP) model using a dataset of steel fiber-reinforced concrete (SFRC) beams. Across 30 independent trials with 65% training, 10% validation, and 25% testing splits, DIGSP consistently outperformed BGP in training and test root mean squared error (RMSE). The Wilcoxon rank-sum test confirmed statistical significance (p < 0.01), and DIGSP showed tighter error distributions and fewer outliers. No significant difference was observed in validation RMSE due to limited sample size. These results demonstrate that domain-informed structural decomposition and symbolic abstraction improve convergence and generalization. DIGSP offers a principled and interpretable modeling strategy for systems where symbolic superposition aligns with the underlying physical structure.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.70)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Multi-population Ensemble Genetic Programming via Cooperative Coevolution and Multi-view Learning for Classification
Khorshidi, Mohammad Sadegh, Yazdanjue, Navid, Gharoun, Hassan, Nikoo, Mohammad Reza, Chen, Fang, Gandomi, Amir H.
This paper introduces Multi-population Ensemble Genetic Programming (MEGP), a computational intelligence framework that integrates cooperative coevolution and the multiview learning paradigm to address classification challenges in high-dimensional and heterogeneous feature spaces. MEGP decomposes the input space into conditionally independent feature subsets, enabling multiple subpopulations to evolve in parallel while interacting through a dynamic ensemble-based fitness mechanism. Each individual encodes multiple genes whose outputs are aggregated via a differentiable softmax-based weighting layer, enhancing both model interpretability and adaptive decision fusion. A hybrid selection mechanism incorporating both isolated and ensemble-level fitness promotes inter-population cooperation while preserving intra-population diversity. This dual-level evolutionary dynamic facilitates structured search exploration and reduces premature convergence. Experimental evaluations across eight benchmark datasets demonstrate that MEGP consistently outperforms a baseline GP model in terms of convergence behavior and generalization performance. Comprehensive statistical analyses validate significant improvements in Log-Loss, Precision, Recall, F1 score, and AUC. MEGP also exhibits robust diversity retention and accelerated fitness gains throughout evolution, highlighting its effectiveness for scalable, ensemble-driven evolutionary learning. By unifying population-based optimization, multi-view representation learning, and cooperative coevolution, MEGP contributes a structurally adaptive and interpretable framework that advances emerging directions in evolutionary machine learning.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
AutoStub: Genetic Programming-Based Stub Creation for Symbolic Execution
Mächtle, Felix, Loose, Nils, Serr, Jan-Niclas, Sander, Jonas, Eisenbarth, Thomas
Symbolic execution is a powerful technique for software testing, but suffers from limitations when encountering external functions, such as native methods or third-party libraries. Existing solutions often require additional context, expensive SMT solvers, or manual intervention to approximate these functions through symbolic stubs. In this work, we propose a novel approach to automatically generate symbolic stubs for external functions during symbolic execution that leverages Genetic Programming. When the symbolic executor encounters an external function, AutoStub generates training data by executing the function on randomly generated inputs and collecting the outputs. Genetic Programming then derives expressions that approximate the behavior of the function, serving as symbolic stubs. These automatically generated stubs allow the symbolic executor to continue the analysis without manual intervention, enabling the exploration of program paths that were previously intractable. We demonstrate that AutoStub can automatically approximate external functions with over 90% accuracy for 55% of the functions evaluated, and can infer language-specific behaviors that reveal edge cases crucial for software testing.